FMT Project

Submitted by: Harsh Pundhir

CONTEXT

A complex modern semiconductor manufacturing process is normally under constant surveillance through the monitoring of signals/variables collected from sensors and/or process measurement points. However, not all of these signals are equally valuable in a specific monitoring system. The measured signals contain a combination of useful information, irrelevant information, and noise, and engineers typically have far more signals than are actually required. If we consider each type of signal as a feature, then feature selection can be applied to identify the most relevant signals. Process engineers can then use these signals to determine the key factors contributing to yield excursions downstream in the process. This enables an increase in process throughput, a decrease in time to learning, and a reduction in per-unit production costs. These signals can also be used as features to predict the yield type: by analysing and trying out different combinations of features, the essential signals that affect the yield type can be identified.
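As a minimal sketch of the feature-selection idea described above, the snippet below ranks a few toy signals by the absolute value of their Pearson correlation with the pass/fail label. This is a simple filter method chosen for illustration only; it is not the Lasso-based reduction listed in the solution outline. The signal names (`sensor_a`, `sensor_b`, `sensor_c`) and all values are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def rank_signals(signals, labels):
    """Return signal names sorted by |correlation| with the label, strongest first."""
    scores = {name: abs(pearson(vals, labels)) for name, vals in signals.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical toy data: -1 = pass, 1 = fail.
labels = [-1, -1, -1, 1, 1, -1]
signals = {
    "sensor_a": [0.9, 1.1, 1.0, 3.0, 3.2, 1.0],  # rises when the unit fails
    "sensor_b": [5.0, 5.0, 5.0, 5.0, 5.0, 5.0],  # constant, carries no information
    "sensor_c": [0.3, 0.9, 0.1, 0.5, 0.4, 0.8],  # mostly noise
}

ranking = rank_signals(signals, labels)
print(ranking)
```

A constant signal scores zero and falls to the bottom of the ranking, which is one simple way to flag signals that can be dropped before modelling.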

DATA DESCRIPTION

The data consists of 1567 examples, each with 591 features. The dataset presented in this case represents a selection of such features, where each example represents a single production entity with its associated measured features, and the labels represent a simple pass/fail yield for in-house line testing. In the target column, “-1” corresponds to a pass and “1” corresponds to a fail; the data timestamp is for that specific test point.
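The label encoding above can be made concrete with a short sketch. It counts passes and fails and maps the -1/1 encoding to 0/1, which many classifiers and metrics expect. The helper names `summarize_labels` and `to_binary` and the toy label list are assumptions for illustration; they do not reflect the real class distribution of the dataset.

```python
from collections import Counter

PASS, FAIL = -1, 1  # encoding stated in the data description

def summarize_labels(labels):
    """Count passes and fails and report the fraction of failing units."""
    counts = Counter(labels)
    return {
        "pass": counts[PASS],
        "fail": counts[FAIL],
        "fail_rate": counts[FAIL] / len(labels),
    }

def to_binary(labels):
    """Map -1 (pass) to 0 and 1 (fail) to 1."""
    return [1 if y == FAIL else 0 for y in labels]

# Hypothetical toy labels, not the real target column.
toy_labels = [-1, -1, 1, -1]
summary = summarize_labels(toy_labels)
binary = to_binary(toy_labels)
print(summary, binary)
```

Checking the fail rate early is useful because a strongly skewed target affects which metrics are meaningful and whether resampling is worth considering.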

PROJECT OBJECTIVE

To build a classifier that predicts the Pass/Fail yield of a particular process entity, and to analyse whether all the features are required to build the model.
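One way to approach the second half of this objective is to train the same classifier on all features and on a reduced subset, then compare held-out accuracy. The sketch below does this with a tiny nearest-centroid classifier written in plain Python. The classifier, the helper names, and the toy data are all assumptions made for illustration; the project itself tunes the models listed in the solution outline.

```python
def centroid(rows):
    """Column-wise mean of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def fit(X, y):
    """Nearest-centroid model: one mean vector per class label."""
    return {label: centroid([x for x, t in zip(X, y) if t == label])
            for label in set(y)}

def predict(model, x):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda label: dist2(model[label]))

def accuracy(X_train, y_train, X_test, y_test, cols):
    """Train and score using only the feature columns listed in `cols`."""
    def take(X):
        return [[row[i] for i in cols] for row in X]
    model = fit(take(X_train), y_train)
    hits = sum(predict(model, x) == t for x, t in zip(take(X_test), y_test))
    return hits / len(y_test)

# Hypothetical toy data: column 0 is informative, column 1 is large-scale noise.
X_train = [[1.0, 50], [1.2, -40], [0.8, 10], [3.0, -30], [3.2, 60], [2.9, 5]]
y_train = [-1, -1, -1, 1, 1, 1]
X_test = [[1.1, 40], [3.1, -20]]
y_test = [-1, 1]

acc_all = accuracy(X_train, y_train, X_test, y_test, cols=[0, 1])
acc_reduced = accuracy(X_train, y_train, X_test, y_test, cols=[0])
print(acc_all, acc_reduced)
```

In this toy case the noisy column dominates the distance computation, so the reduced model scores higher. On real data the comparison should use cross-validation and scaled features rather than a single small split.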

Solution

1. Importing libraries

1.1 Exploring Dtypes

1.2 Dealing with null values

2. Dealing with the curse of dimensionality

2.1 Lasso feature reduction

3. Visualization

3.1 Univariate analysis

3.2 Bivariate and multivariate analysis

4. Model training and cross validation

5. Model Tuning - Logistic Regression

With scaling

With scaling and PCA

6. Model Tuning - Naive Bayes (Gaussian) Model

With scaling

With scaling and PCA

7. Model Tuning - Decision Tree Model

With scaling

With scaling and PCA

8. Model Tuning - K nearest neighbour model

With scaling

With scaling and PCA

9. Model Tuning - Random Forest classifier

With scaling

With scaling and PCA

10. Model Tuning - Bagging Classifier model

With scaling

With scaling and PCA

11. Model Tuning - Boosting classifier model

12. Model Tuning - Gradient boost classifier

13. Model Tuning - Linear regression

14. MODEL EVALUATION

14.1 Picking the best Model

15. Testing future data

15.1 Prediction

16. Handling imbalanced data - SMOTE

17. CONCLUSION

Model Name __ : __ Score
Final model analysis
THANK YOU